

# EEDG/CE 6370 Design and Analysis of Reconfigurable Systems Homework 6 High-Level Synthesis

# 1. Laboratory Objectives

- Learn how to use a commercial HLS tool.
- Perform the 3 main HLS steps using the commercial HLS tool:
  - o Resource allocation
  - o Resource Scheduling
  - o Binding
- Verify that the design works as expected.
- Learn how to use the automatic testbench generator.
- Create input stimuli files.
- Simulate using the cycle-accurate model generator.
- Perform a simulation using the same input vectors as for the SW verification for the RTL sim

# 2. Summary

• In this lab you will learn to use a commercial HLS tool synthesizing a design from C into RTL and verifying it.

# 3. Tool Requirement

• NEC CyberWorkbench

Connect to your UTD Nomachine account and edit your ~/.bashrc file in your home directory adding the following:

### # NEC CWB

export CYBER PATH=/proj/cad/cwb-6.1

export CYBER ADMIN PATH=\${CYBER PATH}/linux x86 64/admin

export CYBER SYSTEMC HOME=\${CYBER PATH}/osci

export CYBER LIB=\${CYBER PATH}/lib

export LD\_LIBRARY\_PATH=\${CYBER\_PATH}/lib:\${LD\_LIBRARY\_PATH}

export CYLMD LICENSE FILE=27010@legolas1.utdallas.edu

export PATH=\$PATH:\${CYBER PATH}/bin

export PATH=\$PATH:\${CYBER ADMIN PATH}/bin

Test that it works by entering the following command at the terminal window:

% cwb &

This should launch CyberWorkBench (CWB).

## 4. Average of 8 Project

## Creating a new empty project

- 1. Open CWB (either clicking on the CWB icon on the desktop or on linux %cwb &) /proj/cad/cwb-6.1/bin/cwb
- 2. Create a new project and workspace called ave8 (File→New)
- 3. Set the project parameters as shown in the figure:





The main items that need to be specified are:

- Clock period (2000 x 1/100ns = 20ns  $\Leftrightarrow$  50 MHz)
- o Target device Cyclone V 5CSEMA5F31C6 device
- o Logic Synthesis tool → Quartus II
- o Select the Cyclone V libraries which contain the area and delay information of the basic operations
  - BLIB (Basic Library): contains the delay and area information of basic logic gates (AND, OR,etc..) and muxes of different sizes (cycloneV.BLIB)
  - FLIB (Functional Unit Libirary): Contains the area and delay information of Functional units of different sizes (e.g. 2-4-6-8-12 bit adder) (cycloneV.FLIB)
- 4. Click on Apply to generate the empty project

# Editing or Inserting a ANSI-C description for synthesis

1. Right click on source  $\rightarrow$  Import  $\rightarrow$  ave8.c



Parse source code to check if there are any syntax errors. The console window will indicate if any erros have occurred and where. A 'light-blue' folder with an .IFF file will be generated if the parsing was successful



# Setting up Synthesis mode and other constraints

1. Right click on 'light-blue' folder → Synthesis options



- 2. Set the scheduling mode to 'Automatic scheduling'. This implies that CWB will automatically time the C description at the scheduling phase. Manual scheduling implies that the user will manually time the description using the '\$' sign as a clock boundary.
- 3. Other synthesis options that control the synthesis process can also be set here as well as the target clock frequency.
- 4. Click on 'Apply' and OK to accept the changes.

## **Create the FU constraint files (Resource Allocation)**

1. Righ-click on 'light-blue' folder → Generate Constraint → Generate FLIB/FCNT to open a dialog window that allows the generation of the resource constraint file



2. Increase the 'Large Unit' field to 10 as shown in the dialog window.



CWB considers FUs of different bws as Large (<=8 bits), middle (8bits<=4 bits) and small (>4 bits). By default it limits the number of FUs allocated as indicated in the dialog window to 5:10:20

3. Click 'Generate'. Two new files will be generated:

ave8-auto.FLIB: FU Library file containing FUs area and delay of FUs not contained in the initial FLIB/BLIB files (empty in most cases like in this)

**ave8-auto.FCNT**: Constraint file limiting the number of FUs that he synthesizer can instantiate in paralle.

Review the contents of the files by 'right-click  $\rightarrow$  open by internal editor



# **High-Level Synthesis**

The conversion of the C program into RTL (Verilog or VHDL) can now take place:

1. Select Verilog at the toolbar



2. Righ-click on 'light-blue' folder → Synthesize/Generate RTL (or toolbar)



A report file (Quality of Results-QoR) will be generated and appears automatically. It contains all the synthesis information: FPGA resources used, critical path, design latency, etc...

**NOTE:** The information reported is based on the FLIB and BLIB file provided. It is therefore important to provide the correct library files or to re-generat these if a different device is targeted. Also, the reported data should be confirmed performing a logic synthesis and a simulation.



Other report files and information files are also created (.err, .SUMM, .tips). The .err files contains any possible synthesis error, warnings and tips to improve the synthesis results.

3. Open ave8.err file and review the warning, information and tips messages



# **Review of Synthesis Result**

The synthesis result can be reviewed in different ways.

1. Schematic View: Right-click 'light-blue' folder→ Show Datapath



This opens the CWB's datapath viewer. Clicking on any part of the schematic will also highlight the source code in CWB that corresponds to it (cross referening)

2. Signal Table: Right-click 'light-blue' folder→ Show Signal Table



The Signal table shows the timing diagram of the synthesis. When inputs and outputs are being read or written and when registers accessed. State 00 = reset state and state 01 = stating doing the computation.

# **Scheduling Result**

You can review the scheduling result, by enabling a synthesis option to show the scheduled of the synthesized circuit. In Information → select Display Data Flow Graph



Re-synthesize the circuit. The CDFG will be displayed. The synthesis will continue after the window is closed.



# **Logic Synthesis**

It is now possible to call Altera's or Xilinx's logic synthesizers from within CWB. For this, first we need to create a synthesis script for the target FPGA vendor. Then the logic synthesizer can be executed by either opening its IDE (GUI) or from the command line interface (CUI).

**NOTE:** Make sure that CWB has the path to Quartus setup in CWB's GUI toolbar: Tools→Options→Tool Path

1. Create Logic synthesis script: Right click on light blue process folder → Logic synthesis options/Generate script →Execute

This creates



This will create a .tcl script file for Quartus II. The newly generated <u>red folder</u> includes all the information regarding the logic synthesis



Note: You might need to set up the path for the Altera tools in Tools→ Options→ Tool Path

2. Logic Synthesis: Right clicking on the red folder → Logic synthesis (CUI) will launch Quartus in command mode. Quartus will start and the Quartus outputs will be displayed on CWB's console window



Once Quartus finishes, the logic synthesis results can be observed at the report folder.

## **Design Verification: Software Simulation**

There are different levels of simulations that can be used to verify the design:

- o Pure SW simulation: normal SW simulation
- o Behavioral simulation: untimed simulation (like SW), but considering the HW data types (ter, var, reg)
- O Cycle-accurate simulation: timed simulation of the scheduled synthesis results
- o RTL simulation
- 1. The first step is to do a normal SW simulation from within CWB. Chane to the 'Project' tab → Create Simulation Scenario.



- 2. Select "Behavioral" and C/C++ and call the scenario name 'ave8\_sw'
- 3. Select the top process (check the check box)



4. Right-click on 'ave8\_sw" folder → options→ Build tab -> Add C to the preprocessor definitions



5. Add the indata.txt input stimuli file to the simulation envirnment



6. Build the binary (.exe file) and simulate it

The outdata.txt file with the SW simulation values will be generated. This will be our 'golden' output against which the simulation results throughout the different synthesis stages will be compared against.

# **Cycle-Accurate simulation**

In order to verify the corretnes of the timing and also the accurate performance of the synthesized designs two options are available: (1) simulate the generated RTL code or (2) created a cycle-accurate model. This second options is normally preferred as the simulation is faster than RTL.

1. Create a new simulation scenario (right-click on the 'simulation work' folder. Select 'Cycle accurate' options and enter scenario name (ave8 cycle)



2. Spefy testbench options (right click → options)









- 3. Copy the indata.txt and outdata.txt file into the new simulation folder and rename as in0.clv and out0.clv. CWB needs an input and output file with the same name as the ports as input stimuli.
- 4. Build and simulate the model. The console window will display if the outputs from the cycle-accurate simulation match the outputs from the SW simulation



5. A .vcd file which can be opened by a waveform viewer (e.g. GTKwave) is generated. Double click on it to open it.



The simulation output should match the result of the SW simulation.

### RTL simulation

The last verification step is to simulate if the RTL generated also matches the 'golden' output.

1. Create a new simulation scenario and select RTL. Name the scenario ave8 rtl



2. Generate the testbench: Select RTL simulator and input stimuli every cycle, Input data from file and radix as decimal. Click OK and run the simulation



# **Design Space Exploration**

C-Based design allows the generation of different architectures with differen area vs. performance constraints without having to modify the actual C code. This is mainly done by modifying the synthesis constraints. E.g. FCNT constraint file or synthesis options.

1. Open the synthesis window to observe the synthesis result of the current design in the design space exploration window. Right-click on 'light-blue' folder→ Show synthesis results→ Synthesis window



The Y-axis represents the area of the design, while the y-axis can be modified to represent different things. Modify to 'Latency index'

2. Open the ave8-auto.FCNT resource constraint file and reduce the number of FUs to 1



3. Apply the changes

4. Set manual FCNT generation so that CWB does not generate a new FCNT file when re-synthesizing



5. Re-synthesize the design. A new design with different Area and latency characteristics is generated and plotted in the synthesis window



Opening the schematic viewer confirms that only 2 FUs have been instantiated. The synthesized results maximizes resource sharing.



Opening the signal table shows that the latency has now increased from 1 cycle to 5



Re-verify the new design. No new simulation scenarios are needed. CWB will automatically re-generate the RTL, testbenches and cycle-accurate simulation models.

### **Transactional Level Simulation**

The input and output files has the extension .clv because each data in them represent an input/output for that specific port each cycle. In this new design an input should be only be read every 4 clock cycles. Two options are available:

- Modify the in0.clv file inserting 0 0 0 every data
- Using transactional level simulation. In this case the synthesizer will generate a valid signal for
  each port and set the signal when a new input is needed. The automatically generated testbench
  will monitor the values of these valid signals and only apply a new testvector when the valid
  signal for that particular port is set.
- 1. Copy the .clv files and rename them as .tlv (e.g in0.tlv and out0.tlv)
- 2. Open the synthesis options dialog window→ detail settings→ Circuit specifications 1→ set 'Valid signal for each port is generated as an output port.



3. Re-synthesize the design and re-run a cycle-accurate simulation.

[END]